PURINE-REGION DNA BINDING PROTEIN
BACKGROUND OF THE INVENTION 



Technical Field 

The present invention relates, in general, 
to a DNA binding protein and to a DNA sequence 
encoding same. In particular, the invention relates 
to a GA binding protein and to DNA segments encoding 
the subunits thereof. 

Background Information 

Herpes simplex virus 1 (HSV1) immediate early (IE) genes are induced
at the outset of the lytic infection by a virion-associated protein
termed VP16 (Post et al, Cell 24, 555 (1981)). At least two classes of
cis-regulatory elements qualify HSV IE genes for induction by VP16.
The most essential VP16 cis-response element is characterized by the
nonanucleotide sequence 5'-TAATGARAT-3' (Mackem et al, Proc. Natl.
Acad. Sci. U.S.A. 79, 4917 (1982); Mackem et al, J. Virol. 44, 939
(1982); Cordingley et al, Nucleic Acids Res. 11, 2347 (1983); Kristie
et al, Proc. Natl. Acad. Sci. U.S.A. 81, 4065 (1984); Gaffney et al,
Nucleic Acids Res. 13, 7847 (1985); Bzik et al, ibid. 14, 929 (1986);
O'Hare et al, J. Virol. 61, 190 (1987); and Triezenberg et al, Genes
Dev. 2, 730 (1988)). VP16 binds tightly to this DNA sequence in a
complex with the cellular transcription factor Oct1 (Preston et al,
Cell 52, 425 (1988); O'Hare et al, ibid. p. 435 (1988); and Gerster et
al, Proc. Natl. Acad. Sci. U.S.A. 85, 6347 (1988)). A second
cis-regulatory element required for VP16-mediated induction of HSV IE
genes consists of three imperfect repeats of the purine-rich
hexanucleotide 5'-CGGAAR-3' (Triezenberg et al, Genes Dev. 2, 730
(1988) and Spector et al, ibid. 87, 5268 (1990)). A protein complex
capable of avid interaction with the purine-rich repeats (GA repeats)
has been identified in soluble preparations of rat liver nuclei
(Triezenberg et al, Genes Dev. 2, 730 (1988)). This GA binding protein
(GABP) consists of two separable subunits. Purified samples of either
subunit do not interact with the GA repeats, yet regain potent DNA
binding activity when mixed (LaMarco et al, Genes Dev. 3, 1372
(1989)).

Applicants have isolated cDNA clones encoding both subunits of GABP
and have revealed that one (GABPα) is related to the Ets transforming
protein, while the other (GABPβ) contains a series of 33-amino acid
repeats related in sequence to a variety of proteins including Notch
of Drosophila melanogaster, Lin12 and Glp1 of Caenorhabditis elegans
and SWI4 and SWI6 of Saccharomyces cerevisiae (Wharton et al, Cell 43,
567 (1985); Greenwald, ibid. 43, 583 (1985); Yochem et al, Nature 335,
547 (1988); Yochem et al, Cell 58, 553 (1989); Austin et al, ibid.,
p. 565 (1989); Breeden et al, Nature 329, 651 (1987); and Andrews et
al, ibid. 342, 830 (1989)). In addition, Applicants have demonstrated
that these two protein sequence motifs, the Ets-related domain of
GABPα and the 33-amino acid repeats of GABPβ, contribute the surfaces
that form a multiprotein complex capable of stable and specific
interaction with DNA. These findings, which form the basis of the
present invention, provide insight into the problem of regulatory
specificity and define, for the first time, a discrete function for
the 33-amino acid repeat motif.

SUMMARY OF THE INVENTION 

It is a general object of the invention to provide DNA segments
encoding the subunits of GABP.

In one embodiment, the present invention relates to a DNA segment
encoding GABPα, GABPβ1 or GABPβ2, or portion thereof.

In a further embodiment, the present invention relates to a construct
comprising at least one of the above-described segments and to a host
cell transformed therewith.

Further objects and advantages of the present invention will become
clear from the description that follows.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1. Amino acid sequences of tryptic peptides derived from GA
binding proteins. GABP (20 μg) was purified to homogeneity (inset) as
described (LaMarco et al, Genes Dev. 3, 1372 (1989)) except that
boiled salmon sperm DNA (20 μg/ml) was included as a non-specific
competitor in the DNA affinity chromatographic step. Approximately
500 picomoles of protein was lyophilized, reduced, acetylated, and
subjected to cleavage by trypsin (Boehringer Mannheim). The resulting
peptides were separated by reverse-phase HPLC as described (Stone et
al, Laboratory Methodology in Biochemistry, Fini et al eds (CRC Press,
Boca Raton, FL, 1991)). Amino acid sequence analysis was performed on
a Vydac C18 column (Applied Biosystems 477-A). The amino acid
sequences derived from peaks 1-13 were: 1, SLFDQGVIEK; 2, ?AWALEGY;
3, DEIS?VGDEGEFK; 4, ELESLNQEDFFQR; 5, LQESLDAHEIELQDIQL?P?R;
6, DQISIVGDEGEFK; 7, MAELV; 8, YVQASQLQQMNEIVTIDQP; 9, TPLHMWASEGHA;
10, GEILWS; 11, LIEIEIDGTEK; 12, ILMANGAPFTTD;
13, TGNNGQIQL?QFLLEL?TDR.

Figure 2. Nucleotide and deduced amino acid sequences of cDNAs
encoding GABP subunits. (A) Sequence of GABPα. (B) Sequences of GABPβ1
and β2. An unamplified cDNA library prepared from mouse adipocyte mRNA
was screened with a mixture of degenerate oligonucleotides derived
from the amino acid sequences of peptides 3, 4, 5, and 8 (Fig. 1)
labeled with 32P using polynucleotide kinase. The basic SSC protocol
was used (Ausubel et al, Current Protocols in Molecular Biology (Wiley
& Sons, NY), 1989). Hybridization was performed at 48°C for 16 hrs.
GABPβ1 and β2 were isolated by screening a day-8.5 mouse embryo cDNA
library (Lee, Mol. Endocrinol. 4, 1034 (1990)) with degenerate
oligonucleotides corresponding to peptides 9 and 12 (Fig. 1). Kinased
oligonucleotide probes were hybridized in 6X SSC, 1X Denhardt's, 0.05%
sodium pyrophosphate, and 100 μg/ml yeast tRNA at 50°C for 14 hours.
Washing conditions were 6X SSC, 0.05% sodium pyrophosphate at 55°C. A
total of five clones were isolated that hybridized with both
oligonucleotide probes. Four of the clones were approximately 2.6 kb
and differed only slightly in the length of the 5' untranslated
region; these cDNA clones encoded GABPβ1. The fifth cDNA clone was
approximately 1.4 kb and differed from the other four at its 3' end;
this cDNA clone encoded GABPβ2. Four additional cDNA clones
corresponding to GABPβ2 were subsequently identified. DNA sequencing
was by the dideoxy chain termination method (Sanger et al, Proc. Natl.
Acad. Sci. USA 74, 5463 (1977)) using Sequenase (U.S. Biochemicals)
under conditions suggested by the manufacturer. The complete
nucleotide sequences of GABPα, β1 and β2 were determined on both DNA
strands using deleted templates or synthetic oligonucleotide primers.
Deletions were made using exonuclease III (Pharmacia) under conditions
specified by the manufacturer. The sequences for GABPβ1 and GABPβ2
were identical up to nucleotide 1130 except for a three nucleotide
insertion (GTA) at position 828 of GABPβ1. Four other independent
isolates of GABPβ1 were identical in sequence to GABPβ2 at this site.
Peptides identified by amino acid sequencing of purified GABP are
underlined in the deduced amino acid sequences. The dashed lines
indicate the sequence in GABPβ1 not found in GABPβ2. The sequence for
β2 is shown from the point at which it diverges from β1.

Figure 3. Tissue distribution of GABP mRNAs. RNA was isolated from
various rat tissues (Chirgwin et al, Biochemistry 18, 5294 (1979)) and
mouse L cells (Chomczynski et al, Anal. Biochem. 162, 156 (1987)).
10 μg of poly A+ RNA was separated on a 1% agarose-formaldehyde gel,
transferred to Nytran (Schleicher and Schuell) and hybridized with a
random-primed probe prepared from GABPα (A) or GABPβ1 (B).

Figure 4. Requirement of GABPα and GABPβ1 for sequence specific DNA
binding. (A) In vitro translation of GABP proteins. Sense and
anti-sense RNAs from GABPα and GABPβ were transcribed in vitro and
translated in rabbit reticulocyte lysates (Krieg et al, Nucl. Acids
Res. 12, 7057 (1984)). Plasmids with cDNAs inserted in the Eco RI and
Xho I sites of Bluescript (Stratagene) were linearized with Asp 718
and transcribed with T3 RNA polymerase to generate sense strand RNA,
or linearized with Bam HI and transcribed with T7 RNA polymerase to
generate anti-sense RNA. RNAs were used to program rabbit reticulocyte
lysates in the presence of 35S-methionine under conditions specified
by the manufacturer (Promega Biotec). Unlabeled protein was used for
DNA binding experiments. The 35S-methionine labeled products were
separated on a 12.5% SDS-polyacrylamide gel and visualized by
fluorography; (-), anti-sense RNA; (+), sense strand RNA. Positions of
molecular weight markers are indicated in kD. (B) Electrophoretic
mobility shift assays with in vitro translated GABP proteins. Proteins
were incubated in the presence of a 32P-labeled DNA fragment from the
HSV ICP4 promoter and subjected to electrophoresis on a non-denaturing
5% polyacrylamide gel in 0.5X TBE (Garner et al, Nucl. Acids Res. 9,
3047 (1981); Fried et al, ibid, p. 6505 (1981)). For DNA binding
assays, samples containing in vitro translated protein were incubated
in 25 mM Tris pH 8.0, 10% glycerol, 50 mM KCl, 3 mM MgCl2, 0.5 mM
EDTA, 1 mM DTT, 50 ng/ml poly dIdC on ice for 10 minutes, then probe
was added and incubation continued at room temperature for 10 minutes.
The probe was a 180 bp Nco I-Sal I fragment excised from the herpes
simplex virus ICP4 promoter. The fragment was labeled by fill-in with
the Klenow fragment of DNA polymerase I in the presence of 32P-dCTP.
Protein:DNA complexes were subjected to electrophoresis on 5% (30:1)
polyacrylamide gels in 0.5X TBE. Radioactive DNA and DNA:protein
complexes were visualized by autoradiography. "B" indicates GABP bound
DNA, "E" indicates DNA bound by proteins endogenous to reticulocyte
lysates.

Figure 5. Schematic diagram of GABP subunits showing regions of amino
acid sequence similarity to related proteins. (Top) GABPα is
represented as a rectangle with the NH2-terminus on the left and the
COOH-terminus on the right. The region of sequence similarity to
Ets-related proteins is shaded (amino acids 316-400) and compared with
the sequences of Ets-1 (Gunther et al, Genes Dev. 4, 667 (1990)), Erg
(Reddy et al, Proc. Natl. Acad. Sci. U.S.A. 84, 6131 (1987)), Elk (Rao
et al, Science 244, 66 (1989)) and E74A (Burtis et al, Cell 61, 85
(1990)). Residues that are common to GABPα and other proteins are
boxed in black. (Bottom) GABPβ1 is represented as a rectangle with the
NH2-terminus on the left and the COOH-terminus on the right. The
33-amino acid repeats are shown as shaded rectangles. The unique
COOH-terminal segment of GABPβ1 relative to GABPβ2 is indicated in
black (333-382). The sequences of the four 33-amino acid repeats in
GABPβ1 are shown below; residues that are common to two or more
repeats are boxed in black and used to derive the GABPβ consensus.
Similar criteria were used to derive consensus sequences for the
33-amino acid repeats of cdc10/SWI4,6 (Aves et al, EMBO J. 4, 457
(1985); Andrews et al, Nature 342, 830 (1989); Breeden et al, Nature
329, 651 (1987)), Notch (Wharton et al, Cell 43, 567 (1985);
Greenwald, ibid. 583 (1985)), glp1 (Yochem et al, Cell 58, 553 (1989);
Austin et al, ibid. p. 565 (1989)), lin12 (Yochem et al, Nature 335,
547 (1988)), ankyrin (Lux et al, ibid. 344, 36 (1990)), NF-κB (Kieran
et al, Cell 62, 1007 (1990); Bours et al, Nature 348, 76 (1990)), fem1
(Yochem et al, Cell 58, 553 (1989); Austin et al, ibid., p. 565
(1989)), and bcl-3 (Ohno et al, ibid., p. 991 (1990)). The repeats
from cdc10, SWI4 and SWI6 were combined to determine the consensus.
The consensus for ankyrin was taken from Lux et al, Nature 344, 36
(1990). The overall consensus was defined as residues present in at
least 6 of the individual consensus repeat sequences.

Figure 6. DNA binding by GABP expressed in bacteria. Purified proteins
were incubated with a 32P-labeled oligonucleotide containing the GABP
binding site derived from the enhancer of the herpes simplex virus
ICP4 gene (5'-TGCGGAACGGAAGCGGAAACCGCCGGATCG-3') (Triezenberg et al,
Genes Dev. 2, 730 (1988); LaMarco et al, ibid. 3, 1372 (1989)). Free
and protein-bound DNA samples were subjected to electrophoresis on 5%
polyacrylamide gels in either 0.5X TBE (A) or 0.25X TBE buffer (B).
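
The relationship between the 5'-CGGAAR-3' hexanucleotide and the
oligonucleotide quoted in the Figure 6 legend can be checked with a
short script. The following is only an illustrative sketch: the
function name and the use of a regular expression are assumptions made
for this example and are not part of the disclosed methods. R is read
as the standard IUPAC symbol for a purine (A or G), and only the
strand written in the legend is scanned.

```python
import re

# IUPAC symbols needed here; R denotes a purine (A or G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]"}


def motif_positions(sequence, motif):
    """Return 0-based start positions of every match of `motif` in `sequence`.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = "".join(IUPAC[base] for base in motif)
    return [m.start() for m in re.finditer(f"(?={pattern})", sequence)]


# Oligonucleotide sequence as quoted in the Figure 6 legend.
ICP4_OLIGO = "TGCGGAACGGAAGCGGAAACCGCCGGATCG"

# Full hexanucleotide (sixth position must be a purine).
perfect = motif_positions(ICP4_OLIGO, "CGGAAR")
# Five-base core shared by all three repeats.
core = motif_positions(ICP4_OLIGO, "CGGAA")

print("CGGAAR matches at:", perfect)
print("CGGAA core matches at:", core)
```

On this strand the CGGAA core occurs three times, while only two of
those occurrences carry a purine at the sixth position, which is
consistent with the description of the repeats as imperfect.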

Figure 7. Characterization of the DNA binding site for GABP. (A)
Increasing concentrations of GABPα, either in the absence (left panel)
or presence (right panel) of GABPβ1, were mixed with a 32P-labeled DNA
fragment derived from the herpes simplex virus ICP4 enhancer. Free and
protein-bound complexes were partially digested with DNase I and
subjected to electrophoresis on an 8% polyacrylamide sequencing gel.
The positions of three purine-rich repeats within the region of DNA
protected from digestion by GABP are indicated by arrows. Lanes 1-6
(left panel) show digestion patterns resulting from GABPα
concentrations starting at 1.5 nM and decreasing in 3-fold increments
to 0.005 nM. Lanes 1-6 (right panel) show patterns resulting from
addition of the same concentrations of GABPα that had been
supplemented with 0.5 nM of GABPβ1. (B) Methylation protection (left
panel) and interference (right panel) assays of DNA binding by GABP.
The same DNA fragment used in (A) was incubated with GABPα, GABPβ1, or
an equimolar mixture of the two subunits, and exposed to dimethyl
sulfate (DMS). Partially methylated DNA was recovered, cleaved with
piperidine, and run on an 8% polyacrylamide sequencing gel. For
methylation interference assays, DNA was partially methylated,
incubated with an equimolar mixture of GABPα and GABPβ1, and subjected
to electrophoresis on a non-denaturing polyacrylamide gel as described
in Fig. 6. Free and protein-bound DNA species were recovered, cleaved
with piperidine, and electrophoresed on an 8% polyacrylamide
sequencing gel. Nucleotide residues closely contacted by GABP are
shown in the lower part of (B). Filled circles identify guanine
residues that were protected from DMS by GABP. Methylation of the same
four guanine residues also inhibited DNA binding by GABP. Open circles
identify adenine residues where methylation is enhanced in the
presence of both GABPα and GABPβ1.
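
The six-lane titration described in part (A) can be laid out
numerically. The sketch below is an illustration only; the function
name is hypothetical and the calculation assumes an exact 3-fold step
between adjacent lanes, which the legend does not state explicitly.

```python
def dilution_series(start_nm, fold, lanes):
    """Concentrations (nM) for a serial dilution, highest concentration first."""
    return [start_nm / fold**i for i in range(lanes)]


# Values taken from the legend: 1.5 nM starting point, 3-fold steps, 6 lanes.
series = dilution_series(1.5, 3, 6)

for lane, conc in enumerate(series, start=1):
    print(f"lane {lane}: {conc:.4f} nM")
```

Under that assumption the sixth lane works out to roughly 0.006 nM,
close to the 0.005 nM end point given in the legend, which is
presumably a rounded value.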

Figure 8. Measurements of DNA binding stability of complexes formed by
various mixtures of GABP subunits. A 32P-labeled oligonucleotide
containing a GABP binding site (Fig. 6) was incubated with GABPα
alone, or together with equimolar amounts of either of the two β
subunits. After a 10 minute incubation at 24°C, protein:DNA complexes
were challenged with a 500-fold excess of unlabeled oligonucleotide.
Protein-bound and free DNA were separated by non-denaturing gel
electrophoresis as described in Fig. 6A. Protein-bound and free DNAs
were located by autoradiography, excised, and quantitated by
scintillation spectrometry. Results are presented as fraction of probe
bound, normalized to 1.0 at the start point (time = 0).
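
The normalization described in this legend can be sketched as follows.
This is a minimal illustration, not the procedure actually used: the
function names are hypothetical and the scintillation counts are
made-up numbers chosen only to show the arithmetic.

```python
def fraction_bound(bound_cpm, free_cpm):
    """Fraction of total probe counts recovered in the protein-bound band."""
    return bound_cpm / (bound_cpm + free_cpm)


def normalize_to_start(fractions):
    """Scale a time course so that the first (time = 0) point equals 1.0."""
    start = fractions[0]
    return [f / start for f in fractions]


# Hypothetical (bound, free) counts at successive times after the challenge.
counts = [(800, 200), (400, 600), (200, 800)]

raw = [fraction_bound(b, f) for b, f in counts]
normalized = normalize_to_start(raw)
print(normalized)
```

Dividing by the time-zero value makes time courses directly comparable
even when different subunit mixtures bind different fractions of probe
at the outset.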

Figure 9. UV-mediated crosslinking of GABP subunits to DNA. Isolated
or mixed GABP subunits were incubated with a 32P-labeled
oligonucleotide containing a GABP binding site (Chodosh in Current
Protocols in Molecular Biology, Vol. II, Ausubel et al eds (Greene
Wiley, New York, 1988)) then exposed to ultraviolet light for varying
lengths of time. UV crosslinking was performed using an
oligonucleotide composed of a GA binding site flanked by 10 bp of
non-specific sequence (5'-AACCAAGCTTGCGGAACGGAAGCGGAAACCG-3')
corresponding to residues located between 280 and 300 bp upstream of
the herpes simplex virus gene encoding ICP4. Oligonucleotides were
labeled to high specific activity by fill-in reaction with the Klenow
fragment of DNA polymerase I in the presence of all four 32P-labeled
dNTPs. DNA binding reactions were performed as described in a 96-well
culture dish, followed by exposure to ultraviolet light. Samples were
denatured by boiling in SDS sample buffer and subjected to
electrophoresis on a denaturing 12.5% SDS-polyacrylamide gel.
Following electrophoresis the gel was dried and exposed to X-ray film,
and crosslinked protein species were visualized by autoradiography.
Time of exposure to UV light (minutes) is indicated above each gel
lane. Migration of molecular weight markers (kD) is shown on the left.

Figure 10. Glutaraldehyde crosslinking of GABPβ1 and GABPβ2 subunits.
Bacterially synthesized proteins were incubated in phosphate buffered
saline with varying concentrations of glutaraldehyde as indicated
below each lane for five minutes at room temperature. Samples were
denatured by boiling in SDS sample buffer and subjected to
electrophoresis on a denaturing 10% polyacrylamide gel. Following
electrophoresis the gel was stained with Coomassie brilliant blue.
Proteins present in crosslinking reactions are indicated above each
lane. βN110 is a truncated version of GABPβ1 missing 110 NH2-terminal
residues (see Fig. 12B).

Figure 11. DNA binding and complex formation assays of deleted
variants of GABPα. Top panel shows schematic representation of GABPα
deletion mutants. Individual mutants are designated according to the
position of deletion end points with respect to the amino acid
sequence of GABPα. Prefix "N" designates deletions missing residues
starting at the NH2-terminus of GABPα, prefix "C" designates deletions
missing COOH-terminal residues, numbers indicate the position of the
amino acid at which the deletion terminates. The Ets-related segment
of GABPα is highlighted by grey stippling. Bottom panel shows an
autoradiographic image of a non-denaturing gel used to separate
DNA:protein complexes formed between variants of GABPα, GABPβ1 and a
32P-labeled oligonucleotide that contained a GABP binding site. Each
variant of GABPα was tested for DNA binding in the absence and
presence of GABPβ1 as indicated above the individual lanes.

Figure 12. Complex formation and UV crosslinking assays of deleted
variants of GABPβ1. Top panels of (A) and (B) show schematic
representations of GABPβ1 deletion mutants. Individual mutants are
designated according to the positions of deletion end points with
respect to the amino acid sequence of GABPβ1. Prefix "N" designates
deletions from the NH2-terminus of GABPβ1 (B), prefix "C" designates
deletions missing COOH-terminal residues. Repeated sequences 33 or 32
amino acids in length that are related to similarly sized repeats in
the Notch protein of Drosophila melanogaster are highlighted by grey
stippling. Unique parts of GABPβ1 and GABPβ2 are indicated by black
and hatched rectangles at their respective COOH-termini. Deletion
mutants were tested for complex formation with GABPα as shown in the
lower left panels of (A) and (B). Each deletion mutant was also tested
in UV crosslinking assays shown in the lower right panels of (A) and
(B). All complex formation and UV crosslinking assays were conducted
in the presence of GABPα and a 32P-labeled oligonucleotide containing
a GABP binding site.

Figure 13. Model depicting complex formed between GABP and DNA. The
sequence of the GABP binding site consists of two hexanucleotide
repeats of the sequence 5'-CGGAAR-3' as in lower part of Fig. 13.
Ovals directly above guanine residues of each hexanucleotide
correspond to GABPα subunits, elongated rectangles correspond to
33-amino acid repeats of GABPβ subunits. Smaller rectangles shown at
top correspond to the region of GABPβ1 required for formation of
stable homodimers. Circular arrows designate flexible regions inferred
to occur between the dimer forming region of GABPβ1 and the 33-amino
acid repeats located at its NH2-terminus.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention relates to a DNA segment encoding all (or a
unique portion) of the heteromeric transcriptional regulatory protein
termed GA binding protein (GABP). The invention further relates to the
encoded proteins (or polypeptides). A "unique portion" as used herein
consists of at least five (or six) amino acids or, correspondingly, at
least 15 (or 18) nucleotides. The present invention further relates to
a recombinant DNA molecule comprising the above DNA segment and to
host cells transformed therewith.

WO 93/04166    PCT/US92/06748

In particular, the present invention relates to a DNA segment that
encodes the entire amino acid sequence of GABPα, GABPβ1 or GABPβ2
given in Figure 2 (the specific DNA segments given in Figure 2 being
only examples), or any unique portion thereof. DNA segments to which
the invention relates also include those encoding substantially the
same protein subunits as shown in Figure 2, including, for example,
allelic and species variations thereof and functional equivalents of
the amino acid sequences of Figure 2. The invention further relates to
a DNA segment substantially identical to one of the subunit sequences
shown in Figure 2. A "substantially identical" sequence is one the
complement of which hybridizes to one of the sequences of Figure 2 at
50°C and 6X SSC (saline/sodium citrate) and which remains bound when
subjected to washing at 55°C with 6X SSC (note: 20X SSC = 3 M sodium
chloride/0.3 M sodium citrate). The invention also relates to
nucleotide fragments complementary to such DNA segments. Unique
portions of the DNA segment, or complementary fragments, can be used
as probes for detecting the presence of respective complementary
strands in DNA (or RNA) containing samples.
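
The salt concentrations implied by the 6X SSC hybridization and wash
conditions follow directly from the 20X definition given above. The
short sketch below only illustrates that proportional scaling; the
names used are hypothetical.

```python
# Composition of 20X SSC as defined in the text (molar concentrations).
SSC_20X = {"sodium chloride": 3.0, "sodium citrate": 0.3}


def ssc_molarity(strength):
    """Molar concentration of each solute in SSC at the given strength (e.g. 6 for 6X)."""
    return {solute: conc * strength / 20 for solute, conc in SSC_20X.items()}


six_x = ssc_molarity(6)
for solute, conc in six_x.items():
    print(f"6X SSC: {conc:.2f} M {solute}")
```

On this basis 6X SSC corresponds to 0.9 M sodium chloride and 0.09 M
sodium citrate.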

The present invention further relates to GABP, and subunits thereof,
substantially free of proteins with which it is normally associated,
and more especially, to unique peptide fragments of the subunits of
that protein. The GABP protein (or functionally equivalent variations
thereof), or peptide fragments thereof, to which the invention
relates, also includes those which are chemically synthesized using
known methods. The proteins and peptides of the present invention can
be modified, for example, phosphorylated, or unmodified.

The present invention also relates to 
recombinantly produced GABP, or subunits thereof,
having the amino acid sequence shown in Figure 2 or 
functionally equivalent variation thereof. The 
recombinantly produced protein can be modified, for 
example phosphorylated, or unmodified. The present 
invention, more particularly, relates to 
recombinantly produced unique peptide fragments of 
GABP subunits. 

The present invention also relates to a 
recombinant DNA molecule (or construct) and to a 
host cell transformed therewith. Using standard 
methodologies, well known in the art, a recombinant 
DNA molecule comprising a vector and a DNA segment 
encoding at least one GABP subunit, or a unique 
portion thereof, can be constructed. Vectors 
suitable for use in the present invention include 
plasmid and viral vectors. The vector can be 
selected so as to be suitable for transforming 
prokaryotic or eukaryotic cells. Advantageously, 
the recombinant molecule includes a promoter 
operably linked to the GABP encoding segment.

The recombinant DNA molecule of the 
invention can be introduced into appropriate host 
cells by one skilled in the art using methods well
known in the art. Suitable host cells include 
prokaryotic cells, such as bacteria, lower 
eukaryotic cells, such as yeast, and higher 
eukaryotic cells, such as mammalian cells. These 
cells can serve as a source of GABP when cultured 
under appropriate conditions. 

As noted at the outset, and as will be further described in the
Examples that follow, significant amino acid sequence similarity
exists between GABPα and the products of the ets1 and ets2 (Watson et
al, Proc. Natl. Acad. Sci. U.S.A. 85, 7862 (1988); Gunther et al,
Genes Dev. 4, 667 (1990)) proto-oncogenes. The Ets-related region of
GABPα is located close to the COOH-terminus of the subunit.
Biochemical studies of Ets1 (Gunther et al, Genes Dev. 4, 667 (1990)),
as well as the related proteins PU.1 (Klemsz et al, Cell 61, 113
(1990)) and E74 (Urness et al, Cell 63, 47 (1990)), have demonstrated
direct, sequence-specific DNA binding. These proteins, as well as the
products of several additional eukaryotic genes, share sequence
similarity in an 85-amino acid region that is required for DNA binding
(Karim et al, Genes Dev. 4, 1451 (1990)). The region of GABPα that is
related to this family of proteins coincides with the 85-amino acid
DNA binding domain (Fig. 5).

The amino acid sequences of GABPβ1 and GABPβ2 contain four repeats of
a related amino acid sequence located at the NH2-termini of both
subunits (Fig. 5). The first two repeats are 32 amino acids in length
and the second two contain 33 amino acids. Similar repeats occur in
the Notch protein of Drosophila melanogaster (Wharton et al, Cell 43,
567 (1985); I. Greenwald, ibid., 583 (1985)), and the Lin12 and Glp1
proteins of Caenorhabditis elegans (Yochem et al, Nature 335, 547
(1988); Yochem et al, Cell 58, 553 (1989); Austin et al, ibid., p. 565
(1989)). These "33-amino acid repeats" were first recognized in
studies of the yeast protein SWI6, which regulates gene expression
involved in mating type switching (Breeden et al, Nature 329, 651
(1987)). Similar repeats have been identified in ankyrin, a
multifunctional protein associated with the membrane of red blood
cells (Lux et al, Nature 344, 36 (1990)), several vaccinia virus
encoded proteins of unknown function (Gillard et al, Proc. Natl. Acad.
Sci. U.S.A. 83, 5573 (1986)), and the transcription factor NF-κB
(Kieran et al, Cell 62, 1007 (1990); Bours et al, Nature 348, 76
(1990)).

The two subunits of GABP exhibit primary sequence motifs typical of
proteins normally found in different cellular compartments.
Accordingly, transcriptional regulatory proteins, such as members of
the Ets family, might interact with membrane bound proteins that
contain the 33-amino acid repeats present in GABPβ. The Notch, Glp1
and Lin12 proteins might sequester transcription factors at the plasma
membrane which could be released in response to appropriate
extracellular signaling events. Alternatively, the cytoplasmic
segments of these transmembrane proteins might be proteolyzed in
response to an extracellular signal, allowing the 33-amino acid
repeats to be translocated to the nucleus where they could abet the
action of a second subunit. Either scenario would offer a direct
pathway of signal transduction.

The results set forth in the Examples that follow demonstrate the
reliance of competent DNA binding complexes on multiple subunits. By
separating functional components onto different polypeptide chains,
critical subunits might be differentially expressed or sequestered,
generating useful strategies for regulation. For example, differential
expression of the mRNAs encoding the β1 and β2 subunits of GABP would
be expected to impact substantially on the function of the resulting
complex. In this regard, it is interesting to note that in cells
undergoing replication, the β2 subunit predominates whereas in
non-dividing cells, subunit β1 predominates.

It will be clear to one skilled in the art from a reading of this
disclosure that advantage can be taken of information provided herein
to effect alterations of both normal and abnormal expression patterns
regulated by the binding complexes described above. Such alterations
can be effected, for example, using a variety of gene therapy
protocols. It is contemplated that by altering the relative amounts of
the β1 and β2 subunits, disease states characterized by rapid cell
division, for example, cancer, can be controlled.

Certain aspects of the invention are described in greater detail in
the non-limiting Examples that follow.

Example 1

Isolation of Recombinant cDNA Clones

GABP (20 μg) was purified from rat liver nuclear extracts and cleaved
with trypsin. Proteolyzed fragments were separated by high performance
liquid chromatography (HPLC), recovered, and subjected to gas-phase
amino acid sequencing (Fig. 1). Partial sequences were derived from
13 tryptic peptides. Degenerate oligonucleotides capable of encoding
four of the thirteen peptide sequences were synthesized and used as
hybridization probes to screen an adipocyte cDNA library. Degenerate
oligonucleotides were labeled with 32P using polynucleotide kinase.
The basic sodium chloride/sodium citrate (SSC) protocol was used for
screening (F.M. Ausubel et al, Current Protocols in Molecular Biology
(Wiley and Sons, NY), 1989). Hybridization was performed at 48°C for
16 hours. One recombinant bacteriophage that contained a cDNA insert
of 2 kb gave a positive signal when hybridized with each of the four
oligonucleotide probes.
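
The size of a degenerate oligonucleotide pool depends on how many
codons can specify each residue of the peptide. The sketch below
illustrates that calculation using the standard genetic code. It is
not a description of how the probes in this Example were actually
designed: the function names, the 6-residue window and the choice of
peptide 4 from Fig. 1 as input are assumptions made for illustration,
and peptides containing unresolved positions ("?") are not handled.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide):
    """Number of distinct nucleotide sequences that could encode `peptide`."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide)


def least_degenerate_window(peptide, length):
    """Return (degeneracy, start index, subsequence) of the best window.

    Ties are resolved in favour of the earliest window.
    """
    windows = [
        (degeneracy(peptide[i:i + length]), i, peptide[i:i + length])
        for i in range(len(peptide) - length + 1)
    ]
    return min(windows)


# Peptide 4 from Fig. 1, scanned with a 6-residue (18-nucleotide) window.
best = least_degenerate_window("ELESLNQEDFFQR", 6)
print(best)
```

Windows rich in residues with one or two codons give the smallest
pools, which is one reason such stretches are commonly preferred when
degenerate probes are designed.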

The insert of this recombinant was sequenced and found to contain an
open reading frame that encoded a protein of 454 amino acids
(Fig. 2A). The predicted molecular weight of this polypeptide
(51.3 kD) corresponded to the size of the GABPα subunit purified from
rat liver nuclei (LaMarco et al, Genes Dev. 3, 1372 (1989); Fig. 1).
Inspection of the deduced amino acid sequence revealed segments that
corresponded to eight of the 13 peptides isolated by trypsin digestion
of intact GABP. On the basis of the latter two observations, this
454 residue polypeptide was tentatively identified as GABPα.

Degenerate oligonucleotides capable of
encoding two of the tryptic peptide sequences not
present in GABPα were synthesized and used as
hybridization probes to search for a cDNA clone that
encoded GABPβ (Lee, Mol. Endocrinol. 4, 1034
(1990)). Kinased oligonucleotide probes were
hybridized in 6X SSC, 1X Denhardt's, 0.05% sodium
pyrophosphate, and 100 μg/ml yeast tRNA at 50°C for
14 hours. Washing conditions were 6X SSC, 0.05%
sodium pyrophosphate at 55°C. A total of five
clones were isolated that hybridized with both
oligonucleotide probes. Four of the clones were
approximately 2.6 kb and differed only slightly in
the length of the 5' untranslated region; these cDNA
clones encoded GABPβ1. The fifth cDNA clone was
approximately 1.4 kb and differed from the other
four at its 3' end; this cDNA clone encoded GABPβ2.
Four additional cDNA clones corresponding to GABPβ2
were subsequently identified. Five recombinant
bacteriophage were identified according to their
capacity to hybridize with both oligonucleotide
probes. One of the cDNA clones differed at the 3'
end from the other four. The largest cDNA insert of
the four (2.6 kb) and the variant (1.4 kb) were
sequenced (Fig. 2B). Both DNA sequences revealed
long open reading frames specifying highly similar
polypeptides. One cDNA encoded a protein of 382
amino acids, the other encoded a protein of 349
residues. Starting at their respective NH2-termini,
the two proteins exhibited identical sequences for
333 amino acids. At this point the sequences of the
two proteins diverged such that the longer one
contained an additional 50 residues before its
terminus. The divergent COOH-terminal segments bore
no apparent amino acid sequence similarity. The
open reading frames of both polypeptides contained
segments that corresponded to the two tryptic
peptides used to design hybridization probes.
Moreover, the predicted molecular weights of the two
polypeptides (41.3 and 37 kD) corresponded closely
with the size of the GABPβ subunit purified from rat
liver nuclei (LaMarco et al, Genes Dev. 3, 1372
(1989); Fig. 1). The 41 kD polypeptide was
therefore provisionally designated as GABPβ1 and the
37 kD polypeptide as GABPβ2.
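
As a rough cross-check of the predicted molecular weights quoted
above, the mass of a polypeptide can be estimated from its length
alone. The sketch below assumes an average residue mass of about
110 Da, a common rule of thumb that is not taken from this
disclosure, and the function name is illustrative. Exact values
depend on amino acid composition, so the estimates are expected to
differ somewhat from the sequence-based figures of 51.3, 41.3 and
37 kD.

```python
# Rough molecular-weight estimate from polypeptide length.
# ASSUMPTION: ~110 Da average residue mass (rule of thumb, not from the source).
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mw_kd(n_residues: int) -> float:
    """Return an approximate molecular weight in kD for a chain of n residues."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# Chain lengths and sequence-based predictions (kD) as stated in the text.
polypeptides = {
    "GABP-alpha": (454, 51.3),
    "GABP-beta1": (382, 41.3),
    "GABP-beta2": (349, 37.0),
}

for name, (length, predicted_kd) in polypeptides.items():
    estimate = estimate_mw_kd(length)
    print(f"{name}: {length} aa, estimate {estimate:.1f} kD, "
          f"text reports {predicted_kd} kD")
```

Under this assumption all three estimates fall within a few percent
of the values given in the text.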

Example 2 

Tissue Distribution of mRNA Encoding
GABPα, GABPβ1 and GABPβ2

Northern (RNA) blot assays were used to
determine the sizes and tissue distributions of mRNA
encoding GABPα, GABPβ1 and GABPβ2 (Fig. 3). The
cDNA corresponding to GABPα identified three mRNAs
of roughly 5.0, 2.8 and 2.6 kb, which were expressed
in a variety of tissues. The GABPα cDNA, which
consisted of slightly less than 2.0 kb (Fig. 2A),
represents a partial copy of any of the three mRNAs.
Two mRNAs measuring 2.7 and 1.5 kb were identified
in northern blots probed with GABPβ1 cDNA. Like
GABPα mRNAs, those encoding GABPβ1 had a wide tissue
distribution. Because the cDNAs that encoded GABPβ1
and GABPβ2 measured 2.6 and 1.4 kb, respectively
(Fig. 2B), they probably represent nearly
full-length copies of the respective mRNAs. Moreover,
because the nucleotide sequences of the two cDNAs
are identical from their respective 5' termini to
the point of abrupt divergence 1.1 kb internal to
the mRNA, they likely represent alternatively
spliced transcripts derived from the same gene.
Consistent with this interpretation is the presence
of a potential splice donor site (AG dinucleotide)
immediately preceding the point of divergence.

Example 3 

GABP DNA Binding Activity 

To test whether the recombinant DNA clones
described above possessed GABP DNA binding activity,
reticulocyte lysates were programmed with RNA
synthesized from the cDNAs that encode GABPα, GABPβ1
and GABPβ2. Each RNA was translated to form a
protein product of the expected size (Fig. 4A).
Individual lysates or mixtures thereof were tested
for DNA binding to a fragment from the HSV1 ICP4
promoter that contained three GA repeats.
Protein:DNA mixes were subjected to electrophoresis
on nondenaturing polyacrylamide gels to separate
free DNA from that complexed with protein.
Reticulocyte lysate that had not been programmed
with exogenous RNA contained protein(s) capable of
forming a complex with the oligonucleotide probe
that migrated more rapidly than the complex formed
by GABP. Other than background activity endogenous
to the reticulocyte lysate, specific protein:DNA
complexes were not observed when lysates programmed
with GABPα, GABPβ1, or GABPβ2 were tested in
electrophoretic mobility shift assays. Likewise, no
new DNA binding activity was observed with lysate
that had been used to co-translate RNAs encoding
GABPα and GABPβ2. However, co-translation of RNAs
encoding GABPα and GABPβ1 caused the lysate to form
a DNA binding activity that could be distinguished
from background (Fig. 4B). The interdependency of
GABPα and GABPβ1 observed in these assays is
consistent with earlier observations that tested
subunits purified from rat liver nuclei (LaMarco et
al, Genes Dev. 3, 1372 (1989)).

Example 4 

DNA Binding Properties of GABP 

Recombinant cDNA copies of the mRNAs that
encode GABPα and GABPβ1 were introduced into
bacteriophage T7 based vectors that allowed
synthesis of the corresponding proteins in
Escherichia coli (Studier et al, J. Mol. Biol., 189,
113 (1986)). Polymerase chain reaction was used to
introduce a Bam HI site at the 5' end of the open
reading frames encoding GABPα or GABPβ1. cDNAs
lacking the 3' untranslated region were inserted
into a modified pT5 vector, which adds two amino
acids (gly-ser) at the NH2-terminus of the encoded
protein. Each subunit was expressed and purified
using conventional chromatographic techniques.
Proteins were expressed in bacteria as described
(Shuman et al, Science 249, 771 (1990)). GABPα was
precipitated from the soluble fraction by the
addition of one volume of 2 M ammonium sulfate in
buffer A (10 mM Tris-HCl, pH 7.9, 100 mM KCl, 0.5 mM
EDTA, 1 mM MgCl2, 2 mM DTT, 0.1 mM
phenylmethylsulfonyl fluoride) with 2 mM CaCl2. The
ammonium sulfate pellet was resuspended in 25 ml
buffer B (25 mM Tris-HCl, pH 8.0, 0.75 mM EDTA, 10%
(v/v) glycerol, 1 mM DTT) with 75 mM NaCl and
dialyzed against the same buffer. The dialysate was
loaded onto a column of Q-Sepharose Fast Flow
(Pharmacia). GABPα was eluted with a 75-500 mM NaCl
gradient in buffer B. Peak fractions were pooled,
dialyzed against buffer B and loaded onto a salmon
sperm DNA-sepharose column. GABPα was eluted with a
0-400 mM NaCl gradient. GABPα was judged by
Coomassie Blue staining of SDS polyacrylamide gels
to account for greater than 90% of the total
protein. GABPβ1 was solubilized from the
particulate fraction of bacterial extracts by
sonication in buffer A supplemented with 7 M urea.
The urea solubilized fraction was dialyzed against
buffer B with 75 mM NaCl and centrifuged at 16,300 x
g for one hour. The supernatant was applied to a
Q-Sepharose column and eluted with a gradient of
75-500 mM NaCl. GABPβ1 was judged to account for
greater than 90% of total protein by Coomassie Blue
staining of SDS-polyacrylamide gels.

The DNA binding properties of the two
individual polypeptides and mixtures thereof were
first studied by gel retardation using a DNA
substrate derived from the enhancer of an immediate
early gene of herpes simplex virus. Consistent with
earlier studies (LaMarco et al, Genes Dev., 3, 1372
(1989)), binding was not observed when DNA was
incubated with either of the isolated subunits.
When GABPα and GABPβ1 were incubated with DNA
simultaneously, a DNA:protein complex exhibiting
substantially retarded mobility relative to free DNA
was observed (Fig. 6, left panel).

The multi-subunit dependence of DNA
binding by GABP was relieved in gel retardation
assays conducted at lower ionic strength (Fig. 6,
right panel). GABPα formed two retarded complexes
that migrated at positions between free DNA and the
complex formed with both subunits. Under these
conditions, the mixture of GABPα and GABPβ1 again
led to the formation of a slowly migrating
DNA:protein complex. The retarded mobility of the
latter complex, relative to those formed by GABPα
alone, reflected the presence of the GABPβ1 subunit.
The validity of this interpretation was confirmed by
the use of antisera specific to each subunit.
Antiserum specific to GABPα further retarded the
migration of complexes formed with GABPα alone or
the mixture of GABPα and GABPβ1. Antiserum specific
to GABPβ1 did not affect the mobility of complexes
formed between GABPα and DNA, but retarded the
complex formed in the presence of both subunits.
Polyclonal antisera were generated by injecting
rabbits with purified GABPα or GABPβ1. Antisera were
added to gel shift reactions at a dilution of 1:20.
Pre-immune sera did not affect the migration of
protein:DNA complexes.
The HSV1-derived DNA fragment used in
binding assays of GABP contains three imperfect
repeats of the hexanucleotide sequence
5'-CGGAAR-3', which were shown in earlier studies to be
protected from DNase I digestion when bound by GABP
(Triezenberg et al, Genes Dev. 2 (1988); LaMarco et
al, ibid. 3, 1372 (1989)). DNase I footprinting
assays were performed using bacterially synthesized
proteins under conditions that allowed interaction
of GABPα alone. As shown in Fig. 7A, GABPα was
capable of protecting the repeated hexanucleotide
motifs from DNase I digestion when added at a
concentration of 0.15 nM. When the GABPβ1 subunit
was added, protection was observed at a 10-fold more
dilute concentration of GABPα (0.015 nM). In
addition, the pattern of nuclease protection was
extended slightly beyond the adenine residues of the
third repeat.

The segment of DNA protected from DNase I
digestion by GABP encompassed all three
hexanucleotide repeats, yet was not centered over
the repeats. Methylation protection and
interference assays were undertaken in order to gain
a more refined image of the sites of close contact
established between GABP and DNA. Methylation
protection assays conducted with GABPα showed a
pattern of protection that included both guanine
residues of the second and third hexanucleotide
repeats. The same sets of guanines were protected
when GABPβ1 was added to the binding reaction. In
the latter case, however, accentuated methylation
was observed at adenine residues located adjacent to
the guanine dinucleotides of the second and third
repeats. Sites of methylation interference were
mapped by separating protein-bound DNA molecules
from those inactivated by partial methylation.
Methylation of guanine dinucleotides in the second
and third hexanucleotide repeats inhibited binding
by the mixture of GABPα and GABPβ1 (Fig. 7B).

The results of methylation protection and
interference assays indicate that GABP binds to
sites on DNA corresponding to two of the three
purine-rich hexanucleotide repeats. The pattern of
protection of guanine residues by GABPα was similar
to that observed with the mixture of both subunits,
indicating that GABPα, when added at a sufficiently
high concentration, can bind specifically to DNA in
the absence of GABPβ1. Such observations offer an
explanation for the two retarded bands observed when
DNA was challenged with GABPα alone under conditions
of low ionic strength (Fig. 6, right panel). The
less retarded of the two bands is interpreted to
represent a complex wherein GABPα is associated with
only one of the two hexanucleotide repeats, while
the more retarded complex is interpreted to contain
GABPα subunits associated with two hexanucleotide
repeats. Binding assays that tested DNA probes
containing a single hexanucleotide repeat supported
this interpretation. When incubated with GABPα and
assayed in low ionic strength gels, such DNA probes
generated only one retarded complex.

Several observations indicate that the
mixture of GABPα and GABPβ1 forms a complex that
binds DNA more stably than the α subunit alone
(LaMarco et al, Genes Dev. 3, 1372 (1989)). To
further investigate the effect of GABPβ1 on DNA
binding by GABPα, the rate at which variously mixed
proteins dissociate from DNA was measured (Fig. 8).
The dissociation rate of GABPα alone was too rapid
to be accurately measured. Less than ten percent of
the DNA remained bound to GABPα after a 10 second
challenge with excess, unlabeled competitor DNA. In
contrast, when both GABPα and GABPβ1 were present,
the dissociation rate was much slower (T1/2 = 1.5
min). Similar assays were performed with a mixture
of GABPα and GABPβ2, which, in earlier experiments,
failed to form a stable complex with DNA. When used
at nM concentrations, GABPβ2 was capable of forming
a DNA binding complex with GABPα (see Fig. 12A).
The β2 isoform of GABP also stabilized DNA binding
by GABPα, yet yielded a complex that dissociated
more rapidly (T1/2 = 30 s) than that formed with
GABPβ1. Because the β1 and β2 isoforms differ only
at their COOH-termini, this part of the protein may
be involved in stabilizing DNA binding.
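
The half-lives reported above can be restated as first-order
dissociation rate constants. The sketch below assumes simple
single-exponential dissociation, which is an assumption made here
because the text reports only half-lives, and its function names are
illustrative. It also converts the observation for GABPα alone
(less than ten percent bound after 10 seconds) into an upper bound
on the half-life.

```python
import math


def k_off_from_half_life(t_half_s: float) -> float:
    """First-order rate constant (per second) from a half-life in seconds."""
    return math.log(2) / t_half_s


def fraction_bound(t_s: float, t_half_s: float) -> float:
    """Fraction of complexes remaining after t seconds of competitor challenge."""
    return math.exp(-k_off_from_half_life(t_half_s) * t_s)


def max_half_life(t_s: float, fraction_remaining: float) -> float:
    """Upper bound on the half-life if at most this fraction survives at time t."""
    return t_s * math.log(2) / math.log(1.0 / fraction_remaining)


# Half-lives stated in the text: 1.5 min and 30 s.
half_lives_s = {"alpha + beta1": 90.0, "alpha + beta2": 30.0}

for name, t_half in half_lives_s.items():
    print(f"{name}: k_off = {k_off_from_half_life(t_half):.4f} per s, "
          f"bound after 10 s = {fraction_bound(10.0, t_half):.2f}")

# GABP-alpha alone: under 10% bound after a 10 s challenge.
print(f"alpha alone: half-life at most {max_half_life(10.0, 0.10):.1f} s")
```

Under this model the complex with β1 dissociates about three times
more slowly than the complex with β2, and GABPα alone would have a
half-life of roughly 3 seconds or less.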

The observations outlined thus far
indicate that the β1 and β2 isoforms of GABP do not
bind to DNA alone, but associate with the α subunit
to augment DNA binding. The question arises,
therefore, as to whether the β subunits cause a
conformational change in α leading to its more avid
interaction with DNA. Alternatively, or in addition
to causing a conformational change, the β subunits
might, in association with GABPα, establish direct
contact with DNA.

To determine whether GABPβ1 contacted DNA
when complexed with GABPα, DNA:protein complexes
were exposed to ultraviolet (UV) light under
conditions expected to permit covalent crosslinking
between DNA and intimately bound proteins (L. A.
Chodosh in Current Protocols in Molecular Biology,
vol II, F. M. Ausubel et al., eds. (Greene/Wiley,
New York, 1988)). UV crosslinking was performed
using an oligonucleotide composed of a GA binding
site flanked by 10 bp of non-specific sequence
(5' AACCAAGCTTGCGGAACGGAAGCGGAAACCG 3') corresponding to
residues located between 280 and 300 bp upstream of
the herpes simplex virus gene encoding ICP4.
Oligonucleotides were labeled to high specific
activity by fill-in reaction with the Klenow
fragment of DNA polymerase I in the presence of all
four 32P-labeled dNTPs. DNA binding reactions were
performed as described in a 96-well culture dish,
followed by exposure to ultraviolet light. Samples
were boiled in SDS-sample buffer and subjected to
electrophoresis on SDS-polyacrylamide gels.
Crosslinked protein species were visualized by
autoradiography. The GABP subunits were incubated
with 32P-labeled DNA that contained the purine-rich
hexanucleotide repeats, exposed to UV light, and
subjected to electrophoresis on a denaturing
polyacrylamide gel. When DNA was incubated with
GABPα and exposed to UV light, a crosslinked product
was observed bearing an electrophoretic mobility
close to that of GABPα (Fig. 9). The appearance of
this product was dependent on the presence of GABPα
and increased in a time-dependent manner upon
exposure to UV light. Moreover, it was eliminated
by the inclusion of excess, unlabeled DNA that
contained the purine-rich hexanucleotide repeats,
but not by excess non-specific DNA.
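
The purine-rich repeats in the crosslinking oligonucleotide can be
located with a simple pattern scan. The sketch below uses the
oligonucleotide sequence as printed above (with internal spacing
removed) and treats R in 5'-CGGAAR-3' as A or G; the helper name is
illustrative and not part of the disclosure.

```python
import re

# Oligonucleotide as printed in the text (upper strand, 5' to 3').
OLIGO = "AACCAAGCTTGCGGAACGGAAGCGGAAACCG"


def find_sites(sequence: str, pattern: str) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) match."""
    # A lookahead keeps each match zero-width so overlapping sites are found.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence)]


core_sites = find_sites(OLIGO, "CGGAA")        # shared five-base core
strict_sites = find_sites(OLIGO, "CGGAA[AG]")  # full consensus, R = A or G

print("core CGGAA starts:", core_sites)
print("CGGAAR starts:", strict_sites)
```

On the sequence as printed, the five-base core occurs three times in
tandem, while only the second and third copies match the full
consensus; the first copy ends in C, which fits the description of
the repeats as imperfect.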

No evidence of protein:DNA crosslinking
was observed when 32P-labeled DNA was mixed with
GABPβ1 and exposed to UV light. However, when the
mixture of GABPα and GABPβ1 was complexed with DNA
and irradiated, new crosslinked products were
observed. In addition to GABPα, two closely
migrating polypeptide bands, slightly larger than
the native size of GABPβ1, became covalently
attached to the radioactive DNA substrate (Fig. 9).
Although GABPβ1 was incapable of binding DNA on its
own, when present in a ternary complex it appeared
to be even more susceptible to UV-mediated
crosslinking than GABPα. These data provide
evidence that the β1 subunit of GABP associates
closely with DNA when complexed with GABPα.

Example 5 

Formation of a Stable Complex between GABPα
and GABPβ in the Absence of DNA

Having found that the α and β subunits of
GABP formed a heteromeric complex when exposed to
their specific DNA substrate, gel filtration
chromatography was used to determine whether these
subunits might associate in the absence of DNA. Gel
filtration chromatography was performed using a
Superose-6 column (10 x 300 mm, Pharmacia) in buffer
B supplemented with 0.4 M NaCl. The column was
calibrated with molecular weight standards
thyroglobulin, apoferritin, catalase, bovine serum
albumin and ribonuclease. 50-100 μg of each protein
was chromatographed at 0.5 ml/min. Elution volume
was converted to K_av by the equation
K_av = (V_e - V_0)/(V_t - V_0), where V_0 = void
volume = 8.1 ml; V_t = total bed volume = 24.0 ml;
V_e = eluted volume. The Stokes radius was calculated
from a plot of (-log K_av)^1/2 versus Stokes radius
(Ackers, Adv. Prot. Chem. 24, 343 (1970)). GABPα
eluted as a single peak at 15.2 ml; GABPβ1 at
14.0 ml; GABPβ2 at 15.8 ml. A mixture of equal
amounts of GABPα and β1 chromatographed as a single
peak at 12.1 ml. The mixture of GABPα and GABPβ2
chromatographed as a single peak at 13.9 ml.
Loaded separately, GABPα and GABPβ1 eluted as single
peaks at K_av values of 0.45 and 0.38 respectively.
However, when loaded together at an equimolar ratio,
both subunits eluted as a single peak at K_av 0.26
(Table 1). Analysis of column fractions by
polyacrylamide gel electrophoresis confirmed that
the peak at K_av 0.26 contained both subunits.

TABLE 1

Determination of molecular weights of GABP
subunits. Purified GABPα, GABPβ1 and GABPβ2 produced in
E. coli were analyzed by gel filtration and sedimentation
velocity as described (17,18). K_av values were calculated
from the elution volume of a Superose-6 FPLC column.
Apparent molecular weights were determined from K_av vs. log
MW for the column. Stokes radii were determined from a
plot of (-log K_av)^1/2. Sedimentation coefficients together
with measured Stokes radii were used to calculate native
molecular weights.

Protein          K_av    Apparent MW         Stokes       Sedimentation     Corrected
                         (K_av vs. log MW)   Radius (Å)   Coefficient (S)   MW

GABPα            0.449   158,000             49.3         3.1               66,000
GABPβ1           0.377   349,000             63.0         3.1               82,000
GABPβ2           0.486   106,000             44.0         2.5               46,000
GABPα + GABPβ1   0.255   1.3 x 10^6          87.7         4.5               170,000
GABPα + GABPβ2   0.367   390,000             64.8         3.8               104,000

Evidence confirming the stable association
of GABP subunits in the absence of DNA was also
obtained from measurements of sedimentation velocity
(Table 1). The gel filtration and sedimentation
properties of a protein or protein complex are
affected both by size and molecular shape. However,
by using the analytical methods of Siegel and Monty
(Martin et al., J. Biol. Chem. 236, 1372 (1961)), it
was possible to calculate native molecular weights
of the various protein species. Sedimentation
coefficients were determined on 4.5 ml 10-30%
glycerol gradients in 25 mM Tris-HCl pH 8.0, 75 mM
NaCl, 0.75 mM EDTA, 1 mM DTT. 30 μg of each protein
was loaded in 0.1 ml together with catalase, bovine
serum albumin, and cytochrome c as internal
standards. Gradients were centrifuged at 4°C for 40
hours at 39,000 rpm. Fractions (0.25 ml) were
collected and analyzed by SDS-PAGE with Coomassie
blue staining. The S value for each sample was
determined by its sedimentation relative to the BSA
and cytochrome c standards. Native molecular
weights were derived using the Stokes radius
together with measured sedimentation coefficients as
described in Siegel et al (Biochim. Biophys. Acta
112, 346 (1966)). Partial specific volume was
calculated using the predicted amino acid sequences
of each GABP subunit as described by Cohn et al
(Proteins, Amino Acids and Peptides as Ions and
Dipolar Ions, E. J. Cohn and J. T. Edsall, eds.
(Reinhold Publishing Co., New York, 1943), pp.
370-381). As shown in Table 1, the calculated molecular
weights of GABPα and GABPβ2 corresponded closely to
their predicted sizes (51.3 and 37 kD,
respectively). In contrast, GABPβ1 exhibited a
native molecular weight (82 kD) roughly twice its
expected size (41.3 kD).

The complex formed between GABPα and
GABPβ1 eluted from the gel filtration column prior
to the largest molecular weight standard. Thus, the
calculated molecular weight of this complex (170 kD)
represents a provisional assignment. Since GABPβ1
existed as a stable dimer on its own, the very large
complex formed between it and GABPα was tentatively
identified as a tetramer composed of two molecules
of each subunit. Polyacrylamide gel analysis of the
constituents of the GABPα:GABPβ1 complex was
consistent with this interpretation, showing that
the subunits existed in equal stoichiometries. This
interpretation was also consistent with the
properties of the complexes formed between GABPα and
the β subunits, which were interpreted to reflect the
dimer forming property of GABPβ1.
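
The figures in Table 1 can be approximately re-derived from the
quantities given in the text. The sketch below computes K_av from
the stated elution volumes and column constants, then estimates
native molecular weight from Stokes radius and sedimentation
coefficient using the Siegel and Monty relation
M = 6πηNas/(1 - v̄ρ). Several inputs are assumptions made here and
are not taken from this disclosure: the solvent is treated as water
at 20°C (η = 0.01002 poise, ρ = 0.9982 g/ml), the partial specific
volume is set to a typical protein value of 0.73 ml/g rather than
the sequence-derived values used in the text, and Stokes radii are
taken to be in angstroms. All function and variable names are
illustrative.

```python
import math

# Column constants stated in the text.
V0_ML = 8.1    # void volume
VT_ML = 24.0   # total bed volume

# ASSUMPTIONS (not from the source): water at 20 C and a typical
# protein partial specific volume.
ETA_POISE = 0.01002     # solvent viscosity, g/(cm*s)
RHO_G_PER_ML = 0.9982   # solvent density
V_BAR_ML_PER_G = 0.73   # partial specific volume
AVOGADRO = 6.022e23


def k_av(elution_ml: float) -> float:
    """K_av = (V_e - V_0) / (V_t - V_0)."""
    return (elution_ml - V0_ML) / (VT_ML - V0_ML)


def native_mw(stokes_radius_angstrom: float, s_svedberg: float) -> float:
    """Siegel-Monty estimate: M = 6*pi*eta*N*a*s / (1 - v_bar*rho)."""
    a_cm = stokes_radius_angstrom * 1e-8
    s_sec = s_svedberg * 1e-13
    buoyancy = 1.0 - V_BAR_ML_PER_G * RHO_G_PER_ML
    return 6 * math.pi * ETA_POISE * AVOGADRO * a_cm * s_sec / buoyancy


# name: (elution ml, Stokes radius, S value, Table 1 K_av, Table 1 corrected MW)
samples = {
    "alpha":         (15.2, 49.3, 3.1, 0.449, 66_000),
    "beta1":         (14.0, 63.0, 3.1, 0.377, 82_000),
    "beta2":         (15.8, 44.0, 2.5, 0.486, 46_000),
    "alpha + beta1": (12.1, 87.7, 4.5, 0.255, 170_000),
    "alpha + beta2": (13.9, 64.8, 3.8, 0.367, 104_000),
}

# Sequence-predicted monomer sizes (kD) from the text. The 1:1
# alpha:beta2 composition is an illustrative assumption.
monomer_kd = {"alpha": 51.3, "beta1": 41.3, "beta2": 37.0}
compositions_kd = {
    "beta1 homodimer": 2 * monomer_kd["beta1"],
    "alpha2:beta1_2 tetramer": 2 * monomer_kd["alpha"] + 2 * monomer_kd["beta1"],
    "alpha:beta2 heterodimer": monomer_kd["alpha"] + monomer_kd["beta2"],
}

for name, (ve, radius, s_value, kav_table, mw_table) in samples.items():
    print(f"{name}: K_av {k_av(ve):.3f} (table {kav_table}), "
          f"MW {native_mw(radius, s_value):,.0f} (table {mw_table:,})")
for name, kd in compositions_kd.items():
    print(f"{name}: expected {kd:.1f} kD from monomer sizes")
```

With these assumptions the recomputed K_av values agree with Table 1
to within about 0.01 and the molecular weights to within a few
percent. A tetramer of two α and two β1 chains would be expected at
about 185 kD from the predicted monomer sizes, compared with the
provisional 170 kD assignment.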

Gel filtration and gradient sedimentation
assays indicated that GABPβ1 might exist as a dimer.
This interpretation was tested using glutaraldehyde
crosslinking assays. Bacterially expressed GABPβ1
and GABPβ2 were exposed to glutaraldehyde and
subjected to electrophoresis on a denaturing
polyacrylamide gel. Incubation of GABPβ1 with
glutaraldehyde led to the formation of a second
polypeptide band exhibiting an apparent molecular
weight roughly double that of the monomeric form of
the protein (Fig. 10). Similar experiments
conducted with GABPβ2 failed to yield an analogous
product.

Additional evidence suggesting that GABPβ1
exists as a dimer resulted from crosslinking
experiments with the intact polypeptide and a
truncated form lacking the 110 NH2-terminal residues
(βN110, see Fig. 12B). Crosslinking of the
truncated protein led to the formation of an
additional species roughly double the size of the
monomeric form. When the truncated protein was
mixed with intact GABPβ1 and exposed to
glutaraldehyde, three crosslinked protein species
were observed. Two species corresponded to
crosslinked, homodimeric complexes that had been
observed upon glutaraldehyde treatment of active
GABPβ1 and the NH2-terminal truncated derivative.
The third species migrated between the presumed
homodimeric forms and probably represented a
heteromeric complex consisting of one GABPβ1
polypeptide and one truncated polypeptide.
Therefore, both molecular weight measurements and
crosslinking assays showed that GABPβ1 but not
GABPβ2 exists as a stable homodimer.

Example 6 

Mapping of Functional Domains of GABPα and GABPβ

Experimental results described above
indicate that GABPα should contain at least two
functional components, one that facilitates DNA
binding and another that allows complex formation
with GABPβ. The GABPβ1 polypeptide should contain
at least three components, facilitating
self-dimerization, heterodimerization with GABPα, and
direct contact with some part of the purine-rich DNA
substrate. Recombinant copies of the genes that
encoded each subunit were systematically deleted to
localize these components. Deletion mutants of
GABPα were generated by polymerase chain reaction
and expressed in pT5 as described (Breeden et al,
Nature 329, 651 (1987)). Soluble bacterial extracts
containing deleted variants of GABPα were used for
binding reactions. NH2-terminal deletions of GABPβ1
were generated by exonuclease III digestion,
followed by digestion with S1 nuclease and ligation
of Bam HI linkers. All deletions were sequenced and
subcloned into the appropriate pET3 vector
(Rosenberg et al., Gene 56, 125 (1987)) to maintain
the proper reading frame. COOH-terminal deletions
were generated using 3' deletions of the cDNA
inserted in Bluescript (Stratagene) by subcloning
Bst EII-Asp 718 or Sac I-Asp 718 fragments into the
pET-GABPβ1 plasmid that had been digested with the
appropriate enzymes. Translation termination codons
were provided by vector sequences so that in some
cases extra amino acids are appended to the open
reading frame. All GABPβ derivatives were insoluble
and re-solubilized in 8 M urea followed by dialysis
against 10 mM Tris pH 8.0, 75 mM KCl or NaCl, 1 mM
DTT, 0.2 mM PMSF, 1 mM benzamidine, 10% glycerol prior
to use in binding reactions. All derivatives were
expressed equivalently as determined by Coomassie
staining of SDS polyacrylamide gels.

Deletion variants of GABPα that were
missing as many as 313 residues from the NH2-terminus
retained the capacities to bind DNA and complex with
GABPβ (Fig. 11). A GABPα variant further missing 17
residues (αN313/437) from the COOH-terminus also
retained both functions. More extensive deletion
from the COOH-terminus, to amino acid 407
(αN313/C407), yielded a protein that was capable of
binding to DNA, but had lost the ability to complex
with GABPβ1. These results showed that the
Ets-related domain of GABPα was sufficient for DNA
binding. The region of GABPα required to form a
complex with GABPβ included the Ets-related segment,
as well as 37 amino acids located on the immediate
COOH-terminal side of the Ets-related domain.

In order to define regions of GABPβ1 that
interact with GABPα and contact DNA, systematically
deleted variants were produced and tested in gel
retardation and UV-crosslinking assays (Fig. 12).
Variants that lacked up to 228 residues (βC154) from
the COOH-terminus of GABPβ1 proved to be functional
in both assays. Deletion of an additional 33
residues (βC121) yielded a protein that failed to
function in either assay. The boundary defined by
these experiments corresponded to the location of
the most COOH-terminal of four 33-amino acid repeats
that are present in both isoforms of GABPβ.

Although about 70% of GABPβ1 could be
deleted from its COOH-terminus without eliminating
interaction with GABPα and DNA, removal of only a
small segment from the NH2-terminus resulted in
deleterious effects. A variant of GABPβ1 that
lacked 19 NH2-terminal residues (βN19) was slightly
less effective in converting GABPα-derived complexes
into the very slowly migrating heteromeric complex.
When tested in the UV crosslinking assay, βN19
yielded a reduced amount of crosslinked product
relative to the intact β1 isoform. Variants that
lacked 47 and 67 residues (βN47 and βN67) were
progressively more defective in the complex
formation assay and failed to be crosslinked to the
radioactive DNA probe as efficiently as the intact
protein. Finally, variants missing 80 or more
residues from the NH2-terminus were completely
defective in both assays. The progressive loss of
function observed in deleted forms of GABPβ1
corresponded to the progressive loss of the 33-amino
acid repeats (Fig. 12). βN19 was truncated within
the first of the repeats, βN47 within the second,
βN67 after the second, and βN80 within the third
repeat. The functional properties of the GABPβ1
deletion mutants indicate that the 33-amino acid
repeats are important both for complex formation
with GABPα and DNA contact.

Example 7

A Model for DNA-Bound GABP 

The foregoing observations have been
incorporated into a provisional model of the complex
formed when GABPα and GABPβ1 associate with a
directly repeated set of purine-rich hexanucleotides
(Fig. 13). Each hexanucleotide repeat is
hypothesized to be contacted by both GABPα and
GABPβ1. The linear order of contact, wherein GABPα
is associated with guanines on one side of each
hexanucleotide and GABPβ with adenines on the other
side, was deduced from three separate observations.
First, GABPα was alone capable of protecting both
guanines from methylation by dimethylsulfate.
Second, the DNase I footprint generated by the mixed
subunits, relative to that resulting from GABPα
alone, was extended in a direction toward the
adenine residues of the hexanucleotide repeat.
Third, addition of GABPβ caused an enhanced pattern
of methylation of adenine residues relative to the
pattern generated in binding reactions that
contained GABPα alone.

The most stable complexes formed between
GABP and DNA were observed with the mixture of GABPα
and GABPβ1. The β1 subunit, unlike β2, was observed
to exist as a stable homodimer. Moreover, when
mixed with GABPα, the β1 subunit generated a high
molecular weight complex probably consisting of two
polypeptides of each subunit. This heteromeric
tetramer is believed to bind in a concerted manner
to two purine-rich hexanucleotide repeats. If that
is the case, a flexible region should exist between
the dimerization domain of GABPβ1 and the surfaces
located near its NH2-terminus that facilitate
interaction with GABPα and DNA. Without such
flexibility a linked set of polypeptides would not
likely be capable of binding simultaneously to a DNA
substrate that is not rotationally symmetric.

Example 8 

Isolation of cDNA Clones Encoding Human GABP
Alpha and Beta Subunits

cDNA clones encoding human GABP alpha,
beta3 and beta4 were isolated by screening a human
fetal brain cDNA library. cDNAs for human GABP
beta1 and beta2 were isolated from a HeLa cell cDNA
library. The probes used to screen for human GABP
alpha were the 865 bp Ava I-Sst I and 678 bp Bam
HI-Sst I fragments of the mouse GABP alpha cDNA. The
probe used to isolate the human beta was an 850 base
pair fragment from the 5' end of the mouse GABP
beta2 cDNA.

Purified DNA fragments used as probes were
radiolabeled with 32P by random priming reactions.
Hybridization conditions were 6X SSC, 1X Denhardt's,
0.05% sodium pyrophosphate, 100 μg/ml yeast tRNA.
The final wash buffer was 2X SSC. The
hybridization and washing temperatures were 65°C.
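
The hybridization solution above is specified by final
concentrations only. As a bench-side illustration, the sketch below
applies C1V1 = C2V2 to work out stock volumes for one batch. The
stock concentrations (20X SSC, 50X Denhardt's, 10% sodium
pyrophosphate, 10 mg/ml tRNA) and the 100 ml batch size are
assumptions chosen for illustration and are not specified in this
disclosure.

```python
def stock_volume_ml(final_conc: float, stock_conc: float, final_volume_ml: float) -> float:
    """Volume of stock needed, from C1*V1 = C2*V2 (both concentrations in the same units)."""
    return final_conc * final_volume_ml / stock_conc


# component: (final concentration from the text, ASSUMED stock concentration)
components = {
    "SSC (X)": (6.0, 20.0),
    "Denhardt's (X)": (1.0, 50.0),
    "sodium pyrophosphate (%)": (0.05, 10.0),
    "yeast tRNA (mg/ml)": (0.1, 10.0),  # 100 ug/ml final
}

FINAL_VOLUME_ML = 100.0  # assumed batch size

volumes = {
    name: stock_volume_ml(final, stock, FINAL_VOLUME_ML)
    for name, (final, stock) in components.items()
}
water_ml = FINAL_VOLUME_ML - sum(volumes.values())

for name, volume in volumes.items():
    print(f"{name}: {volume:.1f} ml of stock")
print(f"water: {water_ml:.1f} ml")
```

Changing the assumed stock concentrations or batch size rescales the
volumes proportionally; the final concentrations are the only values
taken from the text.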

cDNAs were confirmed as the human homologs
of mouse GABP by determination of their nucleotide
sequence.

Six different cDNAs have been deposited
with the American Type Culture Collection (ATCC),
12301 Parklawn Drive, Rockville, MD 20852.

1A = human GABPβ, Eco RI fragment common
to all beta isoforms, in Bluescript KS+, isolated
from HeLa cDNA library.

4A = human GABPβ1, Eco RI fragment in
Bluescript KS+, isolated from HeLa cDNA library.

5A = human GABPβ2, Eco RI fragment in
Bluescript KS+, isolated from HeLa cDNA library.

F = human GABPβ3, Eco RI-Xho I fragment in
Bluescript SK-, isolated from human fetal brain cDNA
library.

J = human GABPβ4, Eco RI-Xho I fragment
in Bluescript SK-, isolated from human fetal brain
cDNA library.

G = human GABPα, Eco RI-Xho I fragment in
Bluescript SK-, from human fetal brain cDNA library.

* * * * 



The entire contents of all documents cited 
herein are hereby incorporated by reference. 

While various aspects of the invention 
have been described in some detail for purposes of 
clarity and understanding, one skilled in the art, 
from a reading of this disclosure will appreciate 
that various changes can be made in form and detail 
without departing from the true scope of the 
invention. One skilled in the art will also 
appreciate that the invention includes the human 
counterparts of the sequences specifically disclosed 
herein as well as the encoded amino acid sequences, 
and fragments thereof. The invention also relates 
to the deposited human sequences and fragments 
thereof as well as to the full length sequences of 
which the partial sequences form a part.

WHAT IS CLAIMED IS:

1. A DNA segment encoding a subunit of
GA binding protein (GABP), or an epitope specific
thereto, or a DNA fragment complementary to said DNA
segment.

2. The DNA segment according to claim 1,
wherein said GABP is human GABP.

3. The DNA segment according to claim 1
wherein said subunit is GABPα.

4. The DNA segment according to claim 1
wherein said subunit is GABPβ1.

5. The DNA segment according to claim 1
wherein said subunit is GABPβ2.

6. The DNA segment according to claim 3
wherein said subunit has the amino acid sequence
shown in Figure 2A.

7. The DNA segment according to claim 4
wherein said subunit has the amino acid sequence
shown in Figure 2B.

8. The DNA segment according to claim 5
wherein said subunit has the amino acid sequence
shown in Figure 2B.



9. A recombinant DNA molecule
comprising:

i) said DNA segment according to claim 1;
and

ii) a vector.

10. A host cell stably transformed with
said recombinant DNA molecule according to claim 9.

11. The host cell according to claim 10
wherein said cell is a procaryotic cell.

12. The host cell according to claim 10
wherein said cell is a eucaryotic cell.

13. A method of producing a recombinant
GABP subunit protein, or portion thereof defining at
least an epitope specific thereto, comprising
culturing said host cell according to claim 10 under
conditions such that said segment is expressed and
said protein thereby produced, and isolating said
protein.

FIG. 4A